Abstract

We consider the complexity of computing the approximate sum $a_1 + a_2 + \cdots + a_n$
of a sorted list of numbers $a_1 \le a_2 \le \cdots \le a_n$. We show an algorithm that computes
a $(1+\epsilon)$-approximation for the sum of a sorted list of nonnegative numbers in
$O(\frac{1}{\epsilon}\min(\log n, \log(\frac{x_{max}}{x_{min}})) \cdot (\log\frac{1}{\epsilon} + \log\log n))$ time, where $x_{max}$ and $x_{min}$ are the largest
and the least positive elements of the input list, respectively. We prove a lower bound of
$\Omega(\min(\log n, \log(\frac{x_{max}}{x_{min}})))$ time for every $O(1)$-approximation algorithm for the sum of a sorted list
of nonnegative elements. We also show that there is no sublinear time approximation algorithm
for the sum of a sorted list that contains at least one negative number.

1. Introduction 



Computing the sum of a list of numbers is a classical problem that is often found in high
school textbooks. There is a famous story about Karl Friedrich Gauss, who computed $1 + 2 + \cdots + 100$
by rearranging the terms into $(1 + 100) + (2 + 99) + \cdots + (50 + 51) = 50 \times 101$ when he was seven
years old, attending elementary school. Such a method is considered an efficient algorithm for
computing the sum of a class of lists of increasing numbers. Computing the sum of a list of elements has many
applications, and is ubiquitous in software design. In classical mathematics, many functions can
be approximated by the sum of simple functions via Taylor expansion. This kind of approximation
theory is in the core area of mathematical analysis. In this article we consider whether there is an efficient
way to compute the sum of a general list of nonnegative numbers in nondecreasing order.

Let $\epsilon$ be a real number at least 0. A real number $s$ is a $(1+\epsilon)$-approximation for the sum
problem $a_1, a_2, \cdots, a_n$ if $\frac{\sum_{i=1}^{n} a_i}{1+\epsilon} \le s \le (1+\epsilon)\sum_{i=1}^{n} a_i$. The approximate sum problem was studied
in the randomized computation model. Every $O(1)$-approximation algorithm with uniform random
sampling requires $\Omega(n)$ time in the worst case if the list of numbers in $[0, 1]$ is not sorted. Using
$O(\frac{1}{\epsilon^2}\log\frac{1}{\delta})$ random samples, one can compute the $(1+\epsilon)$-approximation for the mean, or decide if it
is at most $\delta$ for a list of numbers in $[0, 1]$ [H]. Canetti, Even, and Goldreich [5] showed that the sample
size is tight. Motwani, Panigrahy, and Xu [14] showed an $O(\sqrt{n})$ time approximation scheme for
computing the sum of n nonnegative elements. There is a long history of research for the accuracy 
of summation of floating point numbers (for examples, see piU [T| [tl [51 [SI [5| [TTl [T^ [TOl [TSl [TB] ) . 
The efforts were mainly spent on finding algorithms with small rounding errors. 
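
The two-sided condition in the definition of a $(1+\epsilon)$-approximation can be checked directly. The following minimal Python sketch does so for a candidate value; the helper name `is_approximation` is hypothetical and is not part of the paper.

```python
import math

def is_approximation(s, values, eps):
    """Return True if s is a (1 + eps)-approximation of sum(values).

    Hypothetical helper (not from the paper): it tests
    total / (1 + eps) <= s <= (1 + eps) * total for nonnegative inputs.
    """
    total = math.fsum(values)
    return total / (1.0 + eps) <= s <= (1.0 + eps) * total

# The exact sum below is 100, so 105 passes for eps = 0.1 while 89 does not.
print(is_approximation(105, [10, 20, 70], 0.1))
print(is_approximation(89, [10, 20, 70], 0.1))
```

Note that the condition is symmetric in the multiplicative sense: both overestimates and underestimates are allowed, each by a factor of at most $1+\epsilon$.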

We investigate the complexity of computing the approximate sum of a sorted list. When we have
a large number of data items and need to compute the sum, an efficient approximation algorithm
becomes important. Har-Peled developed a coreset approach for a more general problem. The
method used in his paper implies an $O(\frac{\log n}{\epsilon})$ time approximation algorithm for the approximate
sum of sorted nonnegative numbers [7]. The coreset is a subset of numbers selected from a sorted
input list, and their positions depend only on the size $n$ of the list, and are independent of the numbers.
The coreset of a list of $n$ sorted nonnegative numbers has size $\Omega(\log n)$. This requires the algorithm
time to be also $\Omega(\log n)$ in all cases.

We show an algorithm that gives a $(1+\epsilon)$-approximation for the sum of a list of sorted nonnegative
elements in $O(\frac{1}{\epsilon}\min(\log n, \log(\frac{x_{max}}{x_{min}})) \cdot (\log\frac{1}{\epsilon} + \log\log n))$ time, where $x_{max}$ and $x_{min}$
are the largest and the least positive elements of the input list, respectively. This algorithm has a
complexity comparable to that of Har-Peled's algorithm. Our algorithm is of sub-logarithmic complexity
when $\frac{x_{max}}{x_{min}} \le n^{\frac{1}{(\log\log n)^{1+a}}}$ for any fixed $a > 0$. The algorithm is based on a quadratic region
search method, which is different from the coreset construction used in [7].

We also prove a lower bound $\Omega(\min(\log n, \log(\frac{x_{max}}{x_{min}})))$ for this problem. We first derive an
$O(\log\log n)$ time approximation algorithm that finds an approximate region of the list for holding
the items of size at least a threshold $b$. Our approximate sum algorithm is derived with it as
a submodule. We also show an $\Omega(\log\log n)$ lower bound for approximate region algorithms for the
sum of a sorted list with only nonnegative elements.

In Section 2, we present an algorithm that computes a $(1+\epsilon)$-approximation for the sum of a sorted
list of nonnegative numbers in $O(\frac{1}{\epsilon}\min(\log n, \log(\frac{x_{max}}{x_{min}})) \cdot (\log\frac{1}{\epsilon} + \log\log n))$ time, where $x_{max}$ and
$x_{min}$ are the largest and the least positive elements of the input list, respectively. In Section 3, we
present lower bounds related to the sum of a sorted list. In Section 4, we show the experimental results
for the implementation of our algorithm in Section 2. This paper contains self-contained proofs for
all its results.

2. Algorithm for Approximate Sum of Sorted List 

In this section, we show a deterministic algorithm for the sorted elements. We first show an
approximation algorithm that finds an approximate region of a sorted list with elements of size at least a threshold
$b$.

A crucial part of our approximate algorithm for the sum of a sorted list is to find an approximate
region with elements of size at least a threshold $b$. We develop a method that is much faster than
binary search; it takes $O(\log\frac{1}{\delta} + \log\log n)$ time to find the approximate region. We first apply
the square function to expand the region and then use the square root function to narrow it down to a
region that has only a $(1+\delta)$ factor difference from the exact region. The parameter $\delta$ determines the
accuracy of approximation.

Definition 1. For $i \le j$, let $|[i, j]|$ be the number of integers in the interval $[i, j]$.

If both $i$ and $j$ are integers with $i \le j$, we have $|[i, j]| = j - i + 1$.

Definition 2. A list $X$ of $n$ numbers is represented by an array $X[1, n]$, which has $n$ numbers
$X[1], X[2], \cdots, X[n]$. For integers $i \le j$, let $X[i, j]$ be the sublist that contains elements
$X[i], X[i+1], \cdots, X[j]$. For an interval $R = [i, j]$, denote $X[R]$ to be $X[i, j]$.

Definition 3. For a sorted list $X[1, n]$ with nonnegative elements in nondecreasing order and a
threshold $b$, the $b$-region is an interval $[n', n]$ such that $X[n', n]$ are the numbers at least $b$ in $X[1, n]$.
A $(1+\delta)$-approximation for the $b$-region is a region $R = [s, n]$, which contains the last position $n$ of
$X[1, n]$, such that at least $\frac{|R|}{1+\delta}$ numbers in $X[s, n]$ are at least $b$, and $[s, n]$ contains every position
$j$ with $X[j] \ge b$, where $|R|$ is the number of integers $i$ in $R$.

2.1. Approximate Region

The approximation algorithm for finding an approximate $b$-region to contain the elements at least
a threshold $b$ has two loops. The first loop searches the region by increasing the parameter $m$ via
the square function. When the region is larger than the exact region, the second loop is entered. It
converges to the approximate region with a factor that goes down by a square root each cycle. Using
the combination of the square and square root functions makes our algorithm much faster than
binary search.

In order to simplify the description of the algorithm Approximate-Region(.), we assume $X[i] = -\infty$
for every $i \le 0$. This removes the need for boundary checking when accessing the list $X$.
The description of the algorithm is mainly based on the consideration for its proof of correctness.
For a real number $a$, denote $\lfloor a \rfloor$ to be the largest integer at most $a$, and $\lceil a \rceil$ to be the least integer
at least $a$. For example, $\lfloor 3.7 \rfloor = 3$, and $\lceil 3.7 \rceil = 4$.

Algorithm Approximate-Region($X$, $b$, $\delta$, $n$)

Input: $X[1, n]$ is a sorted list of $n$ numbers in nondecreasing order; $n$ is the size of $X[1, n]$; $b$ is
a threshold in $(0, +\infty)$; and $\delta$ is a parameter in $(0, +\infty)$.

1. if ($X[n] < b$), return $\emptyset$;
2. if ($X[n-1] < b$), return $[n, n]$;
3. if ($X[1] \ge b$), return $[1, n]$;
4. let $m := 2$;
5. while ($X[n - m^2 + 1] \ge b$) {
6. let $m := m^2$;
7. };
8. let $i := 1$;
9. let $m_1 := m$;
10. let $r_1 := m$;
11. while ($m_i > 1 + \delta$) {
12. let $m_{i+1} := \sqrt{m_i}$;
13. if ($X[n - \lfloor m_{i+1} r_i \rfloor + 1] \ge b$), then let $r_{i+1} := m_{i+1} r_i$;
14. else let $r_{i+1} := r_i$;
15. let $i := i + 1$;
16. };
17. return $[n - \lfloor m_i r_i \rfloor + 1, n]$;

End of Algorithm
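
The two loops of Approximate-Region can be sketched in Python as follows. This is an illustrative variant under stated assumptions, not the listing above verbatim: it uses 0-based indexing, returns only the start index of the region, and tracks two integer counters (`inside`, a tail length known to consist of elements at least $b$, and `outside`, a tail length known to reach an element below $b$) in place of the real-valued $m_i$ and $r_i$, so that rounding cannot break the containment guarantee. The names `approximate_region`, `tail_ok`, `inside`, and `outside` are hypothetical.

```python
import math

def approximate_region(x, b, delta, n=None):
    """Return a 0-based index s such that x[s:n] approximates the b-region of x[:n].

    x is sorted in nondecreasing order. The slice x[s:n] contains every
    element >= b, and its length is at most (1 + delta) times the number of
    such elements. An empty region is reported as s == n.
    """
    if n is None:
        n = len(x)

    def tail_ok(k):
        # True if the k-th element from the right end of x[:n] is >= b.
        # Positions before the start of the list behave like -infinity.
        return k <= n and x[n - k] >= b

    if n == 0 or not tail_ok(1):
        return n                      # no element reaches the threshold
    if tail_ok(n):
        return 0                      # every element reaches the threshold

    # Phase 1: square the probe distance (2, 4, 16, 256, ...) until it overshoots.
    inside, outside = 1, 2            # invariant: tail_ok(inside) is True
    while tail_ok(outside):
        inside, outside = outside, outside * outside
    outside = min(outside, n)         # tail_ok(n) is False, so this is still a miss

    # Phase 2: probe near the geometric mean of the two counters, which roughly
    # takes the square root of the ratio outside / inside in every round.
    while outside - inside > 1 and outside - 1 > (1 + delta) * inside:
        k = math.isqrt(inside * outside)
        k = min(max(k, inside + 1), outside - 1)
        if tail_ok(k):
            inside = k
        else:
            outside = k
    return n - (outside - 1)
```

For example, `approximate_region(list(range(1, 100001)), 90000.5, 0.01)` returns a start index whose tail holds all 10000 elements above the threshold and at most 1 percent additional entries, after far fewer probes than the list length.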

Lemma 4. Let $\delta$ be a parameter in $(0, 1)$. Then there is an $O(\log\frac{1}{\delta} + \log\log n)$ time algorithm
such that given an element $b$, and a list $A$ of sorted $n$ elements, it finds a $(1+\delta)$-approximate
$b$-region.






Proof: After the first phase (lines 1 to 7) of the algorithm, we obtain a number $m$ such that

$X[n - m + 1] \ge b$, and (1)
$X[n - m^2 + 1] < b$. (2)

As we already assume $X[i] = -\infty$ for every $i \le 0$, there is no boundary problem for accessing
the input list. The variable $m$ is an integer in the first phase. Thus, the boundary point for the
region with numbers at least the threshold $b$ is in $[n - m^2 + 1, n - m + 1]$. The variable $m$ can be
expressed as $2^{2^k}$ for some integer $k \ge 0$ after executing $k$ cycles in the first phase. Thus, the first
phase takes $O(\log\log n)$ time because $m$ is increased to $m^2$ at each cycle of the first while loop, and
$2^{2^k} \ge n$ for $k \ge \log\log n$.

In the second phase (lines 8 to 17) of the algorithm, we can prove that $X[n - \lfloor r_i \rfloor + 1] \ge b$ and
$X[n - \lfloor m_i r_i \rfloor + 1] < b$ at the end of every cycle (right after executing the statement at line 15) of the
second loop (lines 11 to 16). Thus, the boundary point for the region with elements at least the threshold
$b$ is in $[n - \lfloor m_i r_i \rfloor + 1, n - \lfloor r_i \rfloor + 1]$. The variable $m_i$ is not an integer after $m_i < 2$ in the algorithm.
The claim can be verified via a simple induction. It is true before entering the second loop (lines 11 to 16)
by inequalities (1) and (2). Assume that at the end of cycle $i$,

$X[n - \lfloor r_i \rfloor + 1] \ge b$; and (3)
$X[n - \lfloor m_i r_i \rfloor + 1] < b$. (4)

Let us consider cycle $i + 1$ of the second loop. Let $m_{i+1} = \sqrt{m_i}$.

i. Case 1: $X[n - \lfloor m_{i+1} r_i \rfloor + 1] \ge b$. Let $r_{i+1} = m_{i+1} r_i$ according to line 13 in the algorithm.
Then $X[n - \lfloor r_{i+1} \rfloor + 1] = X[n - \lfloor m_{i+1} r_i \rfloor + 1] \ge b$. By inequality (4) in the hypothesis,

$X[n - \lfloor m_{i+1} r_{i+1} \rfloor + 1] = X[n - \lfloor \sqrt{m_i}\sqrt{m_i}\, r_i \rfloor + 1] = X[n - \lfloor m_i r_i \rfloor + 1] < b$.

ii. Case 2: $X[n - \lfloor m_{i+1} r_i \rfloor + 1] < b$. Let $r_{i+1} = r_i$ according to line 14 in the algorithm. We have
$X[n - \lfloor r_{i+1} \rfloor + 1] = X[n - \lfloor r_i \rfloor + 1] \ge b$ by inequality (3) in the hypothesis. By the condition of this
case, $X[n - \lfloor m_{i+1} r_{i+1} \rfloor + 1] = X[n - \lfloor m_{i+1} r_i \rfloor + 1] < b$.

Therefore, $X[n - \lfloor r_{i+1} \rfloor + 1] \ge b$ and $X[n - \lfloor m_{i+1} r_{i+1} \rfloor + 1] < b$ at the end of cycle $i + 1$ of the
second while loop.

Every number in $X[n - r_i + 1, n]$, which has $r_i$ entries, is at least $b$, and $X[n - m_i r_i + 1, n]$ has
$m_i r_i$ entries and $m_i \le 1 + \delta$ at the end of the algorithm. Thus, the interval $[n - m_i r_i + 1, n]$ returned
by the algorithm is a $(1+\delta)$-approximation for the $b$-region.

It takes $O(\log\log n)$ steps for converting $m$ to be at most 2, and additional $\log\frac{1}{\delta}$ steps to make
$m$ at most $1 + \delta$. When $m_i \le 1 + \delta$, we stop the loop, and output a $(1+\delta)$-approximation.
This step takes at most $O(\log\frac{1}{\delta} + \log\log n)$ time since $m_{i+1}$ is assigned to $\sqrt{m_i}$ at each cycle of the
second loop. This proves Lemma 4. □
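
The two step counts used in the proof can be checked numerically. The sketch below is only an illustration of the growth rates, with hypothetical helper names: `squaring_steps` counts how many squarings take $m = 2$ to at least $n$, and `root_steps` counts how many square roots take $m = 2$ down to at most $1 + \delta$.

```python
import math

def squaring_steps(n):
    """Number of squarings m := m * m, starting from m = 2, until m >= n."""
    m, steps = 2, 0
    while m < n:
        m, steps = m * m, steps + 1
    return steps

def root_steps(delta):
    """Number of square roots, starting from m = 2, until m <= 1 + delta."""
    m, steps = 2.0, 0
    while m > 1.0 + delta:
        m, steps = math.sqrt(m), steps + 1
    return steps

# Compare the counts with log log n and log(1 / delta) (base 2).
for n in (10**3, 10**6, 10**9):
    print(n, squaring_steps(n), math.ceil(math.log2(math.log2(n))))
for delta in (0.1, 0.01, 0.001):
    print(delta, root_steps(delta), math.ceil(math.log2(1 / delta)))
```

Both counts stay within a small additive constant of $\log\log n$ and $\log\frac{1}{\delta}$ respectively, which is the source of the $O(\log\frac{1}{\delta} + \log\log n)$ bound.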

After the first loop of the algorithm Approximate-Region(.), the number $m$ is always of the
format $2^{2^k}$ for some integer $k$. In the second loop of the algorithm Approximate-Region(.), the
number $m$ is always of the format $2^{2^k}$ when $m$ is at least 2. Computing its square root is to convert
$2^{2^k}$ to $2^{2^{k-1}}$, where $k$ is an integer. Since $(1 + \frac{1}{2^{i+1}}) \cdot (1 + \frac{1}{2^{i+1}}) > (1 + \frac{1}{2^i})$, we have that $(1 + \frac{1}{2^{i+1}})$ is
larger than the square root of $(1 + \frac{1}{2^i})$. We may let the variable $m_i$ go down by following the sequence
$\{(1 + \frac{1}{2^i})\}_{i=1}^{\infty}$ after $m_i < 2$. In other words, let $g(.)$ be an approximate square root function such that
$g(1 + \frac{1}{2^i}) = 1 + \frac{1}{2^{i+1}}$ for computing the square root after $m < 2$ in the algorithm. It has the property
$g(m) \cdot g(m) > m$. The assignment $m_{i+1} = \sqrt{m_i}$ can be replaced by $m_{i+1} = g(m_i)$ in the algorithm.
This simplifies the algorithm by removing the computation of the square root while the computational
complexity is of the same order.
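
The approximate square root $g(.)$ described above can be illustrated with exact rational arithmetic. This is a minimal sketch; the closed form `1 + (m - 1) / 2` is simply one way to write the map from $1 + \frac{1}{2^i}$ to $1 + \frac{1}{2^{i+1}}$, and `Fraction` is used only so that the check is free of rounding.

```python
from fractions import Fraction

def g(m):
    """Approximate square root on the sequence 1 + 1/2**i.

    Maps 1 + 1/2**i to 1 + 1/2**(i + 1), so no square root is computed.
    """
    return 1 + (m - 1) / 2

m = Fraction(2)           # 2 = 1 + 1/2**0
for _ in range(10):
    nxt = g(m)
    assert nxt * nxt > m  # g never undershoots the true square root
    m = nxt
print(m)                  # the value reached after ten halvings of m - 1
```

Because $g(m)^2 > m$, replacing $\sqrt{m_i}$ by $g(m_i)$ shrinks the factor slightly more slowly than a true square root, but the number of rounds needed to reach $1 + \delta$ is still $O(\log\frac{1}{\delta})$.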
2.2. Approximate Sum

We present an algorithm to compute the approximate sum of a list of sorted nonnegative elements.
It calls the module for the approximate region, which is described in Section 2.1.

The algorithm for the approximate sum of a sorted list $X$ of $n$ nonnegative numbers generates a
series of disjoint intervals $R_1 = [r_1, r'_1], \cdots, R_t = [r_t, r'_t]$, and a series of thresholds $b_1, \cdots, b_t$ such that
each $R_i$ is a $(1+\delta)$-approximate $b_i$-region in $X[1, r'_i]$, $r'_1 = n$, $r'_{i+1} = r_i - 1$, and $b_{i+1} \le \frac{b_i}{1+\delta}$, where
$\delta$ is a constant multiple of $\epsilon$ fixed by the algorithm and $1 + \epsilon$ is the accuracy for approximation. The sum of numbers in $X[R_i]$ is approximated
by $|R_i| b_i$. As the list $b_1 > b_2 > \cdots > b_t$ decreases exponentially, we can show that $t = O(\frac{1}{\delta}\log n)$.
The approximate sum for the input list is $\sum_{i=1}^{t} |R_i| b_i$. We give a formal description of the algorithm
and its proof below.

Algorithm Approximate-Sum($X$, $\epsilon$, $n$)

Input: $X[1, n]$ is a sorted list of nonnegative numbers (in nondecreasing order), $n$ is the size
of $X[1, n]$, and $\epsilon$ is a parameter in $(0, 1)$ for the accuracy of approximation.

1. if ($X[n] = 0$), return 0;
2. let $\delta$ := (a fixed constant multiple of $\epsilon$);
3. let $r'_1 := n$;
4. let $s := 0$;
5. let $i := 1$;
6. let $b_1 := \frac{X[n]}{1+\delta}$;
7. while ($b_i \ge \frac{\delta X[n]}{3n}$) {
8. let $R_i$ := Approximate-Region($X$, $b_i$, $\delta$, $r'_i$);
9. let $r'_{i+1} := r_i - 1$ for $R_i = [r_i, r'_i]$;
10. let $b_{i+1} := \frac{X[r'_{i+1}]}{1+\delta}$;
11. let $s_i := |[r_i, r'_i]| \cdot b_i$;
12. let $s := s + s_i$;
13. let $i := i + 1$;
14. };
15. return $s$;

End of Algorithm
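
The loop structure of Approximate-Sum can be sketched in Python as follows. This is a sketch under explicit assumptions rather than the paper's implementation: it uses 0-based indexing, an integer-counter variant of the region search (the hypothetical helper `approximate_region`), and the choice $\delta = \epsilon/2$, which is an assumption of this sketch made so that the $(1+\epsilon)$ guarantee can be verified for $\epsilon \in (0, 1]$; it is not claimed to be the constant used in the paper.

```python
import math

def approximate_region(x, b, delta, n):
    """0-based start s of a region x[s:n] holding every element >= b of x[:n],
    with at most (1 + delta) times as many entries as there are such elements."""
    def tail_ok(k):                   # is the k-th element from the right >= b?
        return k <= n and x[n - k] >= b
    if n == 0 or not tail_ok(1):
        return n
    if tail_ok(n):
        return 0
    inside, outside = 1, 2
    while tail_ok(outside):           # squaring phase
        inside, outside = outside, outside * outside
    outside = min(outside, n)
    while outside - inside > 1 and outside - 1 > (1 + delta) * inside:
        k = min(max(math.isqrt(inside * outside), inside + 1), outside - 1)
        if tail_ok(k):
            inside = k
        else:
            outside = k
    return n - (outside - 1)

def approximate_sum(x, eps):
    """Approximate the sum of a nondecreasing list of nonnegative numbers."""
    n = len(x)
    if n == 0 or x[-1] == 0:
        return 0.0
    delta = eps / 2.0                 # assumption of this sketch, not the paper's constant
    cutoff = delta * x[-1] / (3 * n)  # thresholds below this value end the loop
    total, end = 0.0, n               # x[:end] is the part not yet covered
    while end > 0:
        b = x[end - 1] / (1 + delta)  # threshold b_i for the current prefix
        if b < cutoff:
            break
        start = approximate_region(x, b, delta, end)
        total += (end - start) * b    # s_i = |R_i| * b_i
        end = start                   # next prefix ends just before R_i
    return total

data = list(range(1, 1001))
print(approximate_sum(data, 0.1), sum(data))
```

Each region contributes its length times its threshold, so the list elements themselves are only touched by the probes of the region search.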

Theorem 5. Let $\epsilon$ be a positive parameter. Then there is an $O(\frac{1}{\epsilon}\min(\log n, \log(\frac{x_{max}}{x_{min}})) \cdot (\log\frac{1}{\epsilon} + \log\log n))$
time algorithm to compute a $(1+\epsilon)$-approximation for the sum of a sorted list of nonnegative
numbers, where $x_{max}$ and $x_{min}$ are the largest and the least positive elements of the input list,
respectively.

Proof: Assume that there are $t$ cycles executed in the while loop of the algorithm Approximate-Sum(.).
Let regions $R_1, R_2, \cdots, R_t$ be generated. In the first cycle of the loop, the algorithm finds a
region $R_1 = [r_1, n]$ of the elements of size at least $\frac{X[n]}{1+\delta}$. In the second cycle of the loop, the algorithm
finds region $R_2 = [r_2, r_1 - 1]$ for the elements of size at least $\frac{X[r_1 - 1]}{1+\delta}$. In the $i$-th cycle of the loop,
it finds a region $R_i = [r_i, r_{i-1} - 1]$ of elements of size at least $\frac{X[r_{i-1} - 1]}{1+\delta}$. By the algorithm, we have

$j \in R_1 \cup R_2 \cup \cdots \cup R_t$ for every $j$ with $X[j] \ge \frac{\delta X[n]}{3n}$. (5)

Since each $R_i$ is a $(1+\delta)$-approximation of the $\frac{X[r_{i-1} - 1]}{1+\delta}$-region in $X[1, r_{i-1} - 1]$, $R_i$ contains
at least $\frac{|R_i|}{1+\delta}$ entries of size at least $\frac{X[r_{i-1} - 1]}{1+\delta}$ in $X[1, r_{i-1} - 1]$. $R_i$ also contains every entry of size at
least $\frac{X[r_{i-1} - 1]}{1+\delta}$ in $X[1, r_{i-1} - 1]$. Thus, with $b_i = \frac{X[r_{i-1} - 1]}{1+\delta}$ (taking $r_0 - 1 = n$),

$\frac{|R_i|}{1+\delta}\, b_i \le \sum_{j \in R_i} X[j] \le |R_i|\, X[r_{i-1} - 1] = (1+\delta)|R_i|\, b_i$.

Thus,

$\frac{s_i}{1+\delta} \le \sum_{j \in R_i} X[j] \le (1+\delta)\, s_i$.

We have

$\frac{1}{1+\delta}\sum_{j \in R_i} X[j] \le s_i \le (1+\delta)\sum_{j \in R_i} X[j]$. (6)

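Thus, $s_i$ is a $(1+\delta)$-approximation for $\sum_{j \in R_i} X[j]$. We also have $\sum_{X[i] < \frac{\delta X[n]}{3n}} X[i] \le \frac{\delta X[n]}{3}$ since
$X[1, n]$ has only $n$ numbers in total. Therefore, we have the following inequalities:

$\sum_{X[i] \ge \frac{\delta X[n]}{3n}} X[i] = \sum_{i=1}^{n} X[i] - \sum_{X[i] < \frac{\delta X[n]}{3n}} X[i]$ (7)

$\ge \sum_{i=1}^{n} X[i] - \frac{\delta}{3}\sum_{i=1}^{n} X[i]$ (8)

$= (1 - \frac{\delta}{3})\sum_{i=1}^{n} X[i]$. (9)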

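We have the inequalities:

$s = \sum_{i=1}^{t} s_i$ (10)

$\ge \frac{1}{1+\delta}\sum_{X[i] \ge \frac{\delta X[n]}{3n}} X[i]$ (by inequalities (5) and (6)) (11)

$\ge \frac{1}{1+\delta}(1 - \frac{\delta}{3})\sum_{i=1}^{n} X[i]$ (by inequality (9)) (12)

$= \frac{1}{\frac{1+\delta}{1 - \frac{\delta}{3}}}\sum_{i=1}^{n} X[i]$ (13)

$= \frac{1}{1 + \frac{\frac{4}{3}\delta}{1 - \frac{\delta}{3}}}\sum_{i=1}^{n} X[i]$ (14)

$= \frac{1}{1 + \frac{4\delta}{3 - \delta}}\sum_{i=1}^{n} X[i]$ (15)

$\ge \frac{1}{1+\epsilon}\sum_{i=1}^{n} X[i]$ (by the choice of $\delta$). (16)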

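As $R_1, R_2, \cdots, R_t$ are disjoint from each other, we also have the following inequalities:

$s = \sum_{i=1}^{t} s_i$ (17)

$\le \sum_{i=1}^{t} (1+\delta)\sum_{j \in R_i} X[j]$ (by inequality (6)) (18)

$\le (1+\delta)\sum_{j=1}^{n} X[j]$ (19)

$\le (1+\epsilon)\sum_{j=1}^{n} X[j]$. (20)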
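Therefore, the output $s$ returned by the algorithm is a $(1+\epsilon)$-approximation for the sum
$\sum_{i=1}^{n} X[i]$. By Lemma 4, each cycle in the while loop of the algorithm takes $O(\log\frac{1}{\delta} + \log\log n)$
time for generating $R_i$. For the descending chain $r'_1 > r'_2 > \cdots > r'_t$ with $X[r'_{i+1}] \le \frac{X[r'_i]}{1+\delta}$ and
$b_i = \frac{X[r'_i]}{1+\delta} \ge \frac{\delta X[n]}{3n}$ for each $i$, we have that the number of cycles $t$ is at most $O(\frac{1}{\delta}\log n)$. This is
because $X[r'_t] \le \frac{X[n]}{(1+\delta)^t} \le \frac{\delta X[n]}{3n}$ for some $t = O(\frac{1}{\delta}\log n)$. Similarly, the number of cycles $t$ is at most
$O(\frac{1}{\delta}\log(\frac{x_{max}}{x_{min}}))$ because $X[r'_t] \le \frac{x_{max}}{(1+\delta)^t} < x_{min}$ for some $t = O(\frac{1}{\delta}\log(\frac{x_{max}}{x_{min}}))$.

Therefore, there are at most $t = O(\frac{1}{\delta}\min(\log n, \log(\frac{x_{max}}{x_{min}})))$ cycles in the while loop of
the algorithm. Therefore, the total time is $O(\frac{1}{\delta}\min(\log n, \log(\frac{x_{max}}{x_{min}}))(\log\frac{1}{\delta} + \log\log n)) =
O(\frac{1}{\epsilon}\min(\log n, \log(\frac{x_{max}}{x_{min}}))(\log\frac{1}{\epsilon} + \log\log n))$. This proves Theorem 5. □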



3. Lower Bounds 

In this section, we show several lower bounds about approximation for the sum of a sorted list. The
$\Omega(\min(\log n, \log(\frac{x_{max}}{x_{min}})))$ lower bound is based on the general computation model for the sum problem.
The lower bound $\Omega(\log\log n)$ for finding an approximate $b$-region shows that the upper bound is optimal
if using the method developed in Section 2. We also show that there is no sublinear time algorithm
if the input list contains one negative element.

3.1. Lower Bound for Computing Approximate Sum 

In this section, we show a lower bound for the general computation model, which almost matches
the upper bound of our algorithm. This indicates that the algorithm in Section 2 can be improved by at
most an $O(\log\log n)$ factor.

The lower bound is proved by contradiction. In the proof of the lower bound, two lists
$L_1$ and $L_2$ are constructed. For an algorithm with $o(\log n)$ queries, the two lists will have the same
answers to all queries. Thus, the approximation outputs for the two inputs $L_1$ and $L_2$ are the same.
We let the gap between the sums of the two lists be large enough to make it impossible for them to share the
same constant factor approximation.

Theorem 6. For every positive constant $d > 1$, every $d$-approximation algorithm for the sum of a
sorted list of nonnegative numbers needs at least $\Omega(\min(\log n, \log(\frac{x_{max}}{x_{min}})))$ (adaptive) queries to the
list, where $\gamma$ is an arbitrary small constant in $(0, 1)$, and $x_{max}$ and $x_{min}$ are the largest and the
least positive elements of the input list, respectively.

Proof: We first set up some parameters. Let 

c = (4 + (5)^2, (21) 
3 

a = — , and (22) 

4 log c ^ ' 

P = \^ (23) 

where $\delta$ is an arbitrary small constant in $(0, 1)$. Let $m$ be a positive integer.

Let $L_0$ be a list of $t$ numbers equal to $h$ with $h \le c$ and $t \cdot h \le \delta m c^m$, where $h$, $t$, and $\delta$ will be
determined later.

Let list $R_i$ contain $c^{m-i}$ identical numbers equal to $c^i$ for $i = 1, 2, \cdots, m$. Let the first list
$L'_1 = R_1 R_2 \cdots R_m$, which is the concatenation of $R_1, R_2, \cdots$, and $R_m$. The list $L'_1$ has
$n' = c^{m-1} + c^{m-2} + \cdots + c + 1 = \frac{c^m - 1}{c - 1}$ numbers. We have $n' < c^m$ as $c > 2$. Assume that an algorithm $A(.)$ only
makes at most $\beta m$ queries to output a $d$-approximation for the sum of a sorted list of nonnegative
numbers.

Let $A(L_i)$ represent the computation of the algorithm $A(.)$ with the input list $L_i$. During the
computation, $A(.)$ needs to query the numbers in the input list. Let $L'_2 = R'_1 R'_2 \cdots R'_m$, where $R'_k$
has the same length as $R_k$ and is derived from $R_k$ by the following two cases.

Let $L_i = L_0 L'_i$ for $i = 1, 2$.

• Case 1: $R_k$ in $L_1$ has no element queried by the algorithm $A(L_1)$. Let $R'_k$ be a list of $|R_k|$
identical numbers equal to that of $R_{k+1}$ (note that each element of $R_{k+1}$ is equal to $c^{k+1}$).
Since $R'_k$ has $c^{m-k}$ numbers equal to $c^{k+1}$, the sum of numbers in $R'_k$ is $c^{m-k} \cdot c^{k+1} = c^{m+1}$.

• Case 2: $R_k$ in $L_1$ has at least one element queried by the algorithm $A(L_1)$. Let $R'_k = R_k$.

It is easy to verify that $L_2$ is still a nondecreasing list. The number of $R_i$s that are not queried
in $A(L_1)$ is at least $(m - \beta m)$, as the number of queried elements is at most $\beta m$.
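
The block construction and the two cases above can be made concrete with small numbers. In the following sketch the values $c = 4$ and $m = 5$ and the set of queried blocks are purely illustrative assumptions, the prefix list $L_0$ is omitted, and the helper names `build_blocks` and `second_list` are hypothetical.

```python
def build_blocks(c, m):
    """Blocks R_1..R_m: R_i holds c**(m - i) copies of c**i (sum c**m each)."""
    return [[c ** i] * (c ** (m - i)) for i in range(1, m + 1)]

def second_list(blocks, queried_blocks, c):
    """Blocks R'_1..R'_m: an unqueried R_k becomes |R_k| copies of c**(k + 1)."""
    result = []
    for k, block in enumerate(blocks, start=1):
        if k in queried_blocks:
            result.append(list(block))                  # Case 2: keep R_k
        else:
            result.append([c ** (k + 1)] * len(block))  # Case 1: lift the values
    return result

c, m = 4, 5                       # illustrative values only
queried = {2, 5}                  # hypothetical: blocks touched by the algorithm
r1 = build_blocks(c, m)
r2 = second_list(r1, queried, c)
l1 = [v for block in r1 for v in block]
l2 = [v for block in r2 for v in block]
print(len(l1), sum(l1), sum(l2))  # same length, very different sums
```

Every queried position holds the same value in both lists, so an algorithm that only looked at blocks 2 and 5 cannot distinguish them, although each lifted block contributes $c^{m+1}$ instead of $c^m$ to the sum.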

Let $S_1$ be the sum of elements in $L_1$, and $S_2$ be the sum of elements in $L_2$. We have $S_1 \le
(\delta + 1) m c^m$, and $S_2 \ge (m - \beta m) c^{m+1}$. The two lists $L_1$ and $L_2$ have the same result for running
the algorithm. Assume that the algorithm gives an approximation $s$ for both $L_1$ and $L_2$. We have

$s \le d S_1 \le d(1+\delta) m c^m$ for $L_1$, and (24)
$\frac{1}{d}(m - \beta m) c^{m+1} \le \frac{1}{d} S_2 \le s$ for $L_2$. (25)

By inequalities (24) and (25), we have $\frac{1}{d}(m - \beta m) c^{m+1} \le d(1+\delta) m c^m$. Thus, $\frac{1}{d}(1 - \beta) c \le d(1+\delta)$.
Thus, $1 - \frac{d^2(1+\delta)}{c} \le \beta$. By equation (21), we have $1 - \frac{d^2(1+\delta)}{c} > \beta$. This brings a
contradiction. Thus, the algorithm cannot give a $d$-approximation for the sum of a sorted list with at
most $\beta m$ queries to the input list.

The largest number of $L_1$ and $L_2$ is $c^m$. We can create the two cases for the lower bound.

• Case 1: $\log n \ge \log\frac{x_{max}}{x_{min}}$. We just let $L_0$ contain $t = n - n'$ 0s. We have $\log\frac{x_{max}}{x_{min}} = \log\frac{c^m}{c} =
(m - 1)\log c$. Since the algorithm has to make at least $\beta m = \Omega(\log\frac{x_{max}}{x_{min}})$ queries, we can see
a lower bound of $\Omega(\log\frac{x_{max}}{x_{min}})$.

• Case 2: $\log n < \log\frac{x_{max}}{x_{min}}$. Let $L_0$ only contain one number $h$ (note $t = 1$). Since the
algorithm has to make at least $\beta m = \Omega(\log n)$ queries, we can see a lower bound of $\Omega(\log n)$.

□

3.2. Lower Bound for Computing Approximate Region

We give an $\Omega(\log\log n)$ lower bound for the deterministic approximation scheme for a $b$-region in a
sorted input list of nonnegative numbers. The method is that if there is an algorithm with $o(\log\log n)$
queries, two sorted lists $L_1$ and $L_2$ of 0, 1 numbers are constructed. They give the same answer to
each query from the algorithm, but their sums differ greatly. This lower bound shows
that it is impossible to use the method of Section 2, which iteratively finds approximate regions via
a top down approach, to get a better upper bound for the approximate sum problem.

Definition 7. For a sorted list $X[1, n]$ with 0, 1 numbers in nondecreasing order, a $d$-approximate
1-region is a region $R = [s, n]$, which contains the last position $n$ of $X[1, n]$, such that at least
$\frac{|R|}{d}$ numbers in $X[s, n]$ are 1, and $X[s, n]$ contains all the positions $j$ with $X[j] = 1$, where $|R|$ is the
number of integers $i$ in $R$.

Theorem 8. For any parameter $d > 1$, every deterministic algorithm must make at least $\log\log n -
\log\log(d + 1)$ adaptive queries to a sorted input list for the $d$-approximate 1-region problem.

Proof: We let each input list contain either 0 or 1 in each position. Assume that $A(.)$ is a
$d$-approximation algorithm for the approximate region. Let $A(L_i)$ represent the computation of $A(.)$
with input list $L_i$. We construct two lists $L_1$ and $L_2$ of length $n$, and make sure that $A(L_1)$ and
$A(L_2)$ receive the same answer for each query to the input list. For the list of adaptive queries
generated by the algorithm $A(.)$, we generate a series of intervals

$[1, n] = I_0 \supseteq I_1 \supseteq \cdots \supseteq I_m$. (26)

We also have a list

$[n, n] = I_0^R \subseteq I_1^R \subseteq \cdots \subseteq I_m^R$, (27)

where $m$ is the number of queries to the input list by the algorithm $A(.)$ and each $I_j^R$ is a subset of
$I_j$ for $j = 0, 1, 2, \cdots, m$.

Each $I_j$ is partitioned into $I_j^L \cup I_j^R$ such that its right part $I_j^R$ is for 1, and its left part $I_j^L$
is undecided except its leftmost position. Furthermore,

$\frac{|I_j|}{|I_j^R|} \ge n^{\frac{1}{2^j}}$, (28)

and both $I_j$ and $I_j^R$ always contain the position $n$, which is the final position in the input list.

Stage 0

let $I_0 := [1, n]$;

let $I_0^R := [n, n]$;

let $L_1[1] := L_2[1] := 0$;

let $L_1[n] := L_2[n] := 1$;

mark every $1 < i < n$ as an "undecided" position (1 and $n$ are already decided);
End of Stage 0;

It is easy to see that inequality (28) holds for Stage $j = 0$.

For an interval $[a, b]$, $|[a, b]|$ is the number of integers in it as defined in Definition 1. Assume
that $I_j = [a_j, n]$ and $I_j^R = [b_j, n]$. We assume that inequality (28) holds for $j$. We also assume that
both $L_1[i]$ and $L_2[i]$ have been decided to hold 0 for each $i \le a_j$; both $L_1[i]$ and $L_2[i]$ have been
decided to hold 1 for each $i \ge b_j$; and the other points are undecided after stage $j$, which processes
the $j$-th query.

Stage $j + 1$ ($j \ge 0$)

Assume that a position $p$ is queried to the input list by the $(j+1)$-th query ($j \ge 0$) made by the
algorithm $A(.)$. We discuss several cases.

Case 1: $p \le a_j$. Let $I_{j+1} := I_j$ and $I_{j+1}^R := I_j^R$. We have

$\frac{|I_{j+1}|}{|I_{j+1}^R|} = \frac{|I_j|}{|I_j^R|} \ge n^{\frac{1}{2^j}} \ge n^{\frac{1}{2^{j+1}}}$.

Let the answer to the $(j+1)$-th query be 0, as we already assigned $L_1[p] := L_2[p] := 0$ in the
earlier stages by the hypothesis.

Case 2: $p > a_j$ and $p \in I_j^R$. Let $I_{j+1} := I_j$ and $I_{j+1}^R := I_j^R$. We have

$\frac{|I_{j+1}|}{|I_{j+1}^R|} = \frac{|I_j|}{|I_j^R|} \ge n^{\frac{1}{2^j}} \ge n^{\frac{1}{2^{j+1}}}$ (by the hypothesis).

Let the answer to the $(j+1)$-th query be 1, as we already assigned $L_1[p] := L_2[p] := 1$ in the
earlier stages by the hypothesis.

Case 3: $p > a_j$ and $p \notin I_j^R$ and $\frac{|[p, n]|}{|I_j^R|} \ge \sqrt{\frac{|I_j|}{|I_j^R|}}$. Let $I_{j+1} := [p, n]$ and $I_{j+1}^R := I_j^R$. We still
have

$\frac{|I_{j+1}|}{|I_{j+1}^R|} = \frac{|[p, n]|}{|I_j^R|} \ge \sqrt{\frac{|I_j|}{|I_j^R|}} \ge n^{\frac{1}{2^{j+1}}}$.

Let the answer to the $(j+1)$-th query be 0, as the position $p$ will hold the number 0. Let
$L_1[i] := L_2[i] := 0$ for each undecided $i \le p$ (it becomes "decided" after the assignment).

Case 4: $p > a_j$ and $p \notin I_j^R$ and $\frac{|[p, n]|}{|I_j^R|} < \sqrt{\frac{|I_j|}{|I_j^R|}}$. Let $I_{j+1} := I_j$ and $I_{j+1}^R := [p, n]$. We have the
inequalities

$\frac{|I_{j+1}|}{|I_{j+1}^R|} = \frac{|I_j|}{|[p, n]|} = \frac{|I_j| / |I_j^R|}{|[p, n]| / |I_j^R|}$ (29)

$> \sqrt{\frac{|I_j|}{|I_j^R|}}$ (by the condition of this case) (30)

$\ge \sqrt{n^{\frac{1}{2^j}}} = n^{\frac{1}{2^{j+1}}}$. (by the hypothesis) (31)

Let the answer to the $(j+1)$-th query be 1, as the position $p$ will hold the number 1. Let
$L_1[i] := L_2[i] := 1$ for each undecided $i \ge p$ (it becomes "decided" after the assignment).

End of Stage $j + 1$

Assume that there are $m$ queries. The following final stage is executed after processing all the
$m$ queries.

Final Stage

assume that $I_m = [a_m, n]$ and $I_m^R = [b_m, n]$.

let $L_1[i] := 0$ for every undecided $i < b_m$, and let $L_1[i] := 1$ for every undecided $i \ge b_m$;
let $L_2[i] := 0$ for every undecided $i \le a_m$, and let $L_2[i] := 1$ for every undecided $i > a_m$;
End of Final Stage

We note that the assignments to the two lists $L_1$ and $L_2$ are consistent among all stages. In
other words, if $L_i[j]$ is assigned $a \in \{0, 1\}$ at stage $k$, then $L_i[j]$ will not be assigned $b \ne a$ at any
stage $k'$ with $k < k'$, because of the two chains (26) and (27) in the construction.

The two deterministic computations $A(L_1)$ and $A(L_2)$ have the same result. We get two sorted
lists $L_1$ and $L_2$ such that each position in $I_m^R$ of $L_1$ is 1, every other position of $L_1$ is 0, each position
in $[a_m + 1, n]$ of $L_2$ is 1, and every other position of $L_2$ is 0, where $I_m = [a_m, n]$.

On the other hand, the numbers of 1s of $L_1$ and $L_2$ are greatly different. Let $D$ be the approximate
1-region output by the algorithm for the two lists. As the algorithm gives a $d$-approximation for
$L_1$, we have

$\frac{|D|}{d} \le |I_m^R|$. (32)

As $D$ is a $d$-approximate 1-region for $L_2$, $D$ contains every $j$ with $X[j] = 1$ (see Definition 7). We
have

$|I_m| - 1 \le |D|$. (33)

By inequalities (32) and (33), $|I_m| - 1 \le d|I_m^R|$. Therefore, $\frac{|I_m| - 1}{|I_m^R|} \le d$. Thus, $\frac{|I_m|}{|I_m^R|} \le d + 1$ as
$|I_m^R| \ge 1$. By inequality (28), we have $n^{\frac{1}{2^m}} \le d + 1$. This implies $m \ge \log\log n - \log\log(d + 1)$. □

Corollary 9. For any constant $\epsilon \in (0, 1)$, every deterministic $O(1)$-approximation algorithm for the
1-region problem must make at least $(1 - \epsilon)\log\log n$ adaptive queries.

3.3. Lower Bound for Sorted List with Negative Elements 

We derive a theorem that shows there is no sublinear time algorithm with any approximation factor for
the sum of a list of elements that contains both positive and negative elements.

Theorem 10. Let $\epsilon$ be an arbitrary positive constant. There is no algorithm that makes at most
$n - 1$ queries to give a $(1+\epsilon)$-approximation for the sum of a list of $n$ sorted elements that contains
at least one negative element.

Proof: Consider a list of elements $-m(m+1), 2, 4, \cdots, 2m$. This list contains $n = m + 1$ elements.
If there is an algorithm that gives a $(1+\epsilon)$-approximation with at most $n - 1$ queries, then there is an
element, say $2k$, that is not queried by the algorithm.

We construct another list that is identical to the first list except that $2k$ is replaced by $2k + 1$.

The sum of the first list is zero, but the sum of the second list is 1. The algorithm gives the same
result for both lists, as the element $2k$ in the first list and the element $2k + 1$ in the second list are not queried (all
the other queries receive the same answers). This brings a contradiction.

Similarly, in the case that $-m(m+1)$ is not queried, we can bring a contradiction after replacing
it with $-m(m+1) + 1$. □
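
The pair of lists used in this proof is easy to write down explicitly. The sketch below builds both for a small $m$; the helper names `negative_instance` and `perturbed_instance` and the values $m = 5$, $k = 3$ are illustrative assumptions.

```python
def negative_instance(m):
    """The list -m(m+1), 2, 4, ..., 2m from the proof; its sum is 0."""
    return [-m * (m + 1)] + [2 * j for j in range(1, m + 1)]

def perturbed_instance(m, k):
    """Same list with the element 2k replaced by 2k + 1; its sum is 1."""
    values = negative_instance(m)
    values[k] = 2 * k + 1  # index k holds 2k because index 0 is the negative entry
    return values

first = negative_instance(5)
second = perturbed_instance(5, 3)
print(first, sum(first))
print(second, sum(second))
```

A $(1+\epsilon)$-approximation of the first sum must be exactly 0, while any $(1+\epsilon)$-approximation of the second sum is at least $\frac{1}{1+\epsilon} > 0$, so no single output is valid for both lists.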



4. Implementation and Experimental Results 

As computing the summation of a list of elements is widely used, testing the algorithm with a program
is important. Our algorithm not only has a theoretical guarantee for its speed and accuracy, but is also
simple to convert into software. We have implemented the algorithm described in Section 2.
It quickly computes the approximate sum of a sorted list of nonnegative real
numbers. As the algorithm is simple, it is straightforward to convert it into a C++ program, which shows
satisfactory performance for both the speed and accuracy of approximation.

In the experiments conducted, we set up a loop to compute the summation of n = 10^ elements. 
The loop is repeated k — 100 times. The approximation algorithm is much faster than the brute 
force method to compute the approximate sum. 
In order to avoid the memory limitation problem, we use a nondecreasing function $x(.)$, instead
of a list, from integers to double type floating point numbers. There is a function
"double approximate_sum(double (*x)(int), double e, int n)". If we let function $x(i)$ return the $i$-th element of an
input list, it can also handle the input of a list of numbers, and compute its approximate sum. In
order to avoid the time consuming computation for the square root function, we set up a table of 30
entries to save the values of $2^{2^k}$ with integer $k \in [-20, 9]$. This table is enough to handle $\epsilon$ as small
as 10~^ without calling library function sqrt{.) to compute the square root, and n as large as 2^ . 
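
The table-based square root can be sketched as follows. This is an illustration in Python rather than the C++ implementation described above, and the dictionary layout and the names `ROOT_TABLE` and `table_sqrt` are assumptions: entry $k$ stores $2^{2^k}$, so the square root of entry $k$ is simply entry $k - 1$.

```python
import math

# Hypothetical table layout: entry k stores 2**(2**k) for k = -20, ..., 9.
ROOT_TABLE = {k: 2.0 ** (2.0 ** k) for k in range(-20, 10)}

def table_sqrt(k):
    """Square root of 2**(2**k), read from the table instead of calling sqrt."""
    return ROOT_TABLE[k - 1]

print(len(ROOT_TABLE))          # number of stored entries
print(ROOT_TABLE[-20] - 1.0)    # distance of the smallest entry from 1
```

Stepping down the table by one index replaces each square root in the second loop of Approximate-Region, and the smallest entry bounds how close to 1 the factor can get without computing further roots.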

When the number n of numbers of the input is fixed to be 10^, the speed of the software depends 
on the accuracy 1 + e. We let x{i) = i during the experiments. For parameter e = 0.1,0.01,0.001 
and 0.0001, our algorithm for the approximate sum is much faster than the brute force method, 
which computes the exact sum. 

Our algorithm may be slower than the brute force method when $\epsilon$ is very small (for example
$\epsilon = 0.00001$). This is reasonable from the analysis of the algorithm, as the complexity is inversely
proportional to $\epsilon$, and the algorithm Approximate-Sum(.) generates a lot of regions $R_i$ with only
one position.

5. Conclusions and Open Problems 

We studied the approximate sum in a sorted list with nonnegative elements. For a fixed $\epsilon$, there is a
$\log\log n$ factor gap between the upper bound of our algorithm and our lower bound. An interesting
problem for further research is to close this gap. Another interesting problem is the computational
complexity of approximate sum in the randomized computational model, which is not discussed in
this paper.